
Week 7


  • Experiments with CLIP
    • Drew confusion matrices comparing the classification errors of CLIP and Tesseract. Result: CLIP performed better.
  • Set up a Git repository, shrivastava95/clip-ocr, for fine-tuning CLIP on a given dataset.
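
The confusion-matrix comparison mentioned above can be sketched roughly as follows. This is a minimal illustration and not the code from the repository: the class names, labels, and predictions are made-up toy data, and `confusion_matrix` and `accuracy` are hypothetical helper names.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Build a matrix where entry [i][j] counts samples whose true class is
    classes[i] and whose predicted class is classes[j]."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy data (assumption): three character classes, six samples, two models.
classes = ["a", "b", "c"]
truth = ["a", "a", "b", "b", "c", "c"]
model_one_preds = ["a", "a", "b", "c", "c", "c"]
model_two_preds = ["a", "b", "b", "c", "c", "a"]

matrix_one = confusion_matrix(truth, model_one_preds, classes)
matrix_two = confusion_matrix(truth, model_two_preds, classes)

# Print each matrix row by row, followed by the overall accuracy.
for name, matrix in [("model one", matrix_one), ("model two", matrix_two)]:
    print(name)
    for row in matrix:
        print("  ", row)
    print("   accuracy:", accuracy(matrix))
```

Off-diagonal entries show which classes each model confuses with which, so two models can be compared class by class rather than by a single accuracy number.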

Screenshots / Videos



  • Learnt about OpenAI's CLIP model, which measures semantic similarity between image and text pairs and can be used zero-shot.
  • Similarity is computed as the cosine similarity between the image and text projections in a common embedding space.
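
The cosine-similarity step described above can be illustrated with a small sketch. It assumes the image and text embeddings have already been produced by some encoder. The three-dimensional vectors and the prompt strings below are invented for illustration, and `zero_shot_label` is a hypothetical helper, not part of CLIP's API.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_label(image_embedding, text_embeddings):
    """Pick the text prompt whose embedding is most similar to the image."""
    return max(
        text_embeddings,
        key=lambda label: cosine_similarity(image_embedding, text_embeddings[label]),
    )


# Toy embeddings (assumption): real CLIP embeddings have hundreds of dimensions.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "the letter A": [1.0, 0.0, 0.1],
    "the letter B": [0.0, 1.0, 0.3],
}

best_label = zero_shot_label(image_embedding, text_embeddings)
print(best_label)
```

Because no classifier is trained on the target classes, changing the set of candidate prompts is enough to change the set of labels the model can choose from.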